
Conversation

thoangtrvn
Contributor

@thoangtrvn thoangtrvn commented Mar 21, 2025

This is related to the prior PR: #47

This adds the second Triton kernel (the decode stage) to be used in mamba-based models for inference.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 21, 2025
cache_seqlens: Optional[torch.Tensor] = None,
conv_state_indices: Optional[torch.Tensor] = None,
pad_slot_id: int = PAD_SLOT_ID,
):
Contributor

You are returning o, but it is not listed here in your function signature?

for example: cache_indices = [pad_slot_id, 1, 20, pad_slot_id]
in this case, the kernel will not process entries at
indices 0 and 3
out: (batch, dim) or (batch, dim, seqlen)
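The pad_slot_id behavior described in the docstring above can be sketched in plain Python. This is illustrative host-side logic, not the PR's actual Triton kernel; the helper name and sentinel value are assumptions.

```python
# Illustrative sketch (not the PR's kernel): entries in conv_state_indices
# equal to pad_slot_id mark padded batch slots that the kernel skips
# instead of using them to index into the cache.
PAD_SLOT_ID = -1  # assumed sentinel value

def active_batch_entries(conv_state_indices, pad_slot_id=PAD_SLOT_ID):
    """Return the batch positions the kernel would actually process."""
    return [i for i, slot in enumerate(conv_state_indices)
            if slot != pad_slot_id]

# With cache_indices = [pad_slot_id, 1, 20, pad_slot_id], the entries at
# indices 0 and 3 are skipped, matching the docstring example above.
```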
Contributor

Also inconsistent: out here vs. o elsewhere.

conv_state_indices=conv_state_indices,
pad_slot_id=pad_slot_id,
)
return o
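The reviewer's two points (document the return value, and use out consistently instead of a bare o) could look roughly like the following. The function name, parameter set, and body here are illustrative placeholders, not the PR's actual API.

```python
# Hypothetical sketch of the suggested cleanup, not the PR's real code:
# the return value is named `out` and documented in the signature/docstring.
from typing import Optional

def causal_conv1d_update_sketch(
    x,                                       # (batch, dim) or (batch, dim, seqlen)
    conv_state,
    weight,
    cache_seqlens: Optional[object] = None,
    conv_state_indices: Optional[list] = None,
    pad_slot_id: int = -1,
):
    """Decode-step update (placeholder body).

    Returns:
        out: (batch, dim) or (batch, dim, seqlen), same shape as x.
    """
    out = x  # stand-in for the kernel launch producing the output tensor
    return out
```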
Contributor

Nit, but I'm not a fan of using o by itself... out or output etc. makes it clearer imo.

@lessw2020
Contributor

Hi @thoangtrvn - thanks for the update, and sorry for the delay!
I had a question about the missing return value in the signature, as well as the inconsistent naming (out, then o); both are minor but would be nice to clarify.
Otherwise looks good!

@thoangtrvn
Contributor Author

Thanks @lessw2020, I'll update based on your feedback.
